Brute Image OCR Tesseract

Example challenge

https://mega.nz/file/WkNUXLaK#UNMDnX5-NJXvpx15-U1BvFIDy2jw3WgEgY_IyLi1-vA

Solving using Tesseract

We have images numbered up to image 3903.
Each image is different, for example:
Pasted image 20240226195145.png
Pasted image 20240226195224.png

Run Tesseract over every image and collect the first recognized character of each, using this script as a base:

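import subprocess

# Restrict OCR output to alphanumeric characters
WHITELIST = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

text = ""

for i in range(3904):  # image0.png .. image3903.png
    image_path = f'chall/image{i}.png'  # Update with the actual image path
    try:
        # check_output returns Tesseract's stdout as bytes and raises
        # CalledProcessError on a non-zero exit code
        extracted_text = subprocess.check_output(
            ['tesseract', image_path, 'stdout', '-c', f'tessedit_char_whitelist={WHITELIST}']
        ).decode('utf-8')
        print(f'processing image {i}')
        # Keep only the first recognized character (empty if OCR found nothing)
        first_char = extracted_text.strip()[:1]
        print(first_char)
        text += first_char
    except subprocess.CalledProcessError as e:
        print(f'Error running Tesseract: {e}')

print(text)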
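The loop above calls Tesseract on one image at a time. Since every call is an independent process, the work could also be spread over a thread pool. The following is only a sketch under assumptions: the helper names `ocr_first_char` and `collect_first_chars` and the worker count are made up for illustration, and a failed image contributes an empty string, which shifts later positions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ocr_first_char(image_path):
    """Run the tesseract CLI on one image; return its first non-blank character.

    Returns '' if Tesseract exits with an error or recognizes nothing.
    """
    try:
        out = subprocess.check_output(
            ['tesseract', image_path, 'stdout'],
            stderr=subprocess.DEVNULL,  # hide Tesseract's progress messages
        ).decode('utf-8')
    except subprocess.CalledProcessError:
        return ''
    return out.strip()[:1]


def collect_first_chars(paths, ocr=ocr_first_char, workers=8):
    """OCR all paths concurrently and join the results.

    Executor.map() yields results in input order, so the characters
    stay aligned with the image numbering.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return ''.join(pool.map(ocr, paths))
```

To use it, build the list of image paths the same way as in the loop above and pass it to `collect_first_chars`. Threads are sufficient here because the heavy work happens in the external `tesseract` processes.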

The output is a Base32 string. Decode it in CyberChef with the From Base32 operation.
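The same decoding step can be done locally with Python's standard `base64` module. This is a minimal sketch, assuming the OCR output may contain stray whitespace or lowercase letters and may be missing its trailing `=` padding (the whitelist used above does not include `=`). The function name `decode_ocr_base32` is hypothetical.

```python
import base64


def decode_ocr_base32(text):
    """Decode a Base32 string collected from OCR output.

    Removes whitespace, restores '=' padding to a multiple of 8
    characters, and accepts lowercase letters via casefold=True.
    Raises binascii.Error if the cleaned string is still invalid.
    """
    cleaned = ''.join(text.split())
    cleaned += '=' * (-len(cleaned) % 8)
    return base64.b32decode(cleaned, casefold=True)
```

For example, `decode_ocr_base32(text)` on the string printed by the script returns the decoded bytes, provided every character was recognized correctly.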

Faster solution
Pasted image 20240226204923.png

for i in {0..3903}; do tesseract image$i.png stdout | head -c 1 >> result.txt; done

cat result.txt | base32 -d

This prints the flag.
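If the decode step rejects the input, a misread character may be the cause. A small check like the one below can list the characters that fall outside the Base32 alphabet (uppercase A-Z, digits 2-7 and `=` per RFC 4648). It is a sketch with a hypothetical helper name, and it assumes exactly one character was written per image, so that a position corresponds to an image number.

```python
import string

# RFC 4648 Base32 alphabet plus the padding character
BASE32_ALPHABET = set(string.ascii_uppercase + '234567' + '=')


def find_invalid_base32(text):
    """Return (position, character) pairs for characters outside the Base32 alphabet."""
    return [(i, ch) for i, ch in enumerate(text) if ch not in BASE32_ALPHABET]
```

Running it on the contents of `result.txt` (with the trailing newline stripped) points to the images worth checking by eye.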